Oriented k-windows: A PCA driven clustering method
Authors
Abstract
Technical Report HPCLAB-SCG 2/02-06 (revised 3/06)
LABORATORY: High Performance Information Systems Laboratory
GRANTS: University of Patras "Karatheodori" Grant; Bodossaki Foundation Graduate Fellowship
REPOSITORY: http://scgroup.hpclab.ceid.upatras.gr
REF: In Advances in Web Intelligence and Data Mining, M. Last, P.S. Szczepaniak, Z. Volkovich and A. Kandel, eds., Springer Studies in Computational Intelligence, v. 23, pp. 319-328, Springer, 2006.
Similar resources
Influential Features PCA for High Dimensional Clustering
We consider a clustering problem where we observe feature vectors X_i ∈ R^p, i = 1, 2, ..., n, from K possible classes. The class labels are unknown and the main interest is to estimate them. We are primarily interested in the modern regime of p ≫ n, where classical clustering methods face challenges. We propose Influential Features PCA (IF-PCA) as a new clustering procedure. In IF-PCA, we select...
Important Features PCA for high dimensional clustering
We consider a clustering problem where we observe feature vectors X_i ∈ R^p, i = 1, 2, ..., n, from K possible classes. The class labels are unknown and the main interest is to estimate them. We are primarily interested in the modern regime of p ≫ n, where classical clustering methods face challenges. We propose Important Features PCA (IF-PCA) as a new clustering procedure. In IF-PCA, we select a ...
Comparing k-means clusters on parallel Persian-English corpus
This paper compares clusters of aligned Persian and English texts obtained with the k-means method. Text clustering has many applications in various fields of natural language processing. So far, much research on clustering English documents has been carried out. This raises the question: can those results be extended to other languages? Since the goal of document clustering is grouping of docum...
ViDaExpert: user-friendly tool for nonlinear visualization and analysis of multidimensional vectorial data
ViDaExpert is a tool for visualization and analysis of multidimensional vectorial data. ViDaExpert is able to work with data tables of "object-feature" type that might contain numerical feature values as well as textual labels for rows (objects) and columns (features). ViDaExpert implements several statistical methods such as standard and weighted Principal Component Analysis (PCA) and the meth...
Learning Taxonomy for Text Segmentation by Formal Concept Analysis
This paper addresses two problems: deriving a taxonomy from a text and concept-oriented text segmentation. The Formal Concept Analysis (FCA) method is applied to solve both of these linguistic problems. The proposed segmentation method offers a conceptual view of text segmentation, using a context-driven clustering of sentences. The Concept-oriented Clustering Segmentation algorithm (COC...